An anomaly identification method for structural monitoring data considering spatial-temporal correlation

ABSTRACT

The present invention belongs to the technical field of health monitoring for civil structures and proposes an anomaly identification method for structural monitoring data that considers spatial-temporal correlation. First, current and past observation vectors are defined for the monitoring data and pre-whitened; second, a statistical correlation model is established for the pre-whitened current and past observation vectors, so that the spatial and temporal correlation in the monitoring data are considered simultaneously; then, the model is divided into two parts, i.e., the system-related and system-unrelated parts, and two corresponding statistics are defined; finally, the control limits of the statistics are determined, and it is decided that there is an anomaly in the monitoring data when each of the statistics exceeds its corresponding control limit.

TECHNICAL FIELD

The present invention belongs to the technical field of health monitoring for civil structures, and an anomaly identification method considering spatial-temporal correlation is proposed for structural monitoring data.

BACKGROUND

The service performance of civil structures inevitably deteriorates under the combined effects of long-term loading, environmental corrosion and fatigue. Through in-depth analysis of structural monitoring data, abnormal structural conditions can be discovered in time and an accurate early safety warning can be provided, which is of practical significance for ensuring the safe operation of civil structures. At present, anomaly identification for structural monitoring data is mainly achieved through statistical methods, which can generally be divided into two categories: 1) univariate control charts, such as the Shewhart control chart and the CUSUM control chart, in which a separate control chart is established for the monitoring data at each measurement point to identify anomalies; and 2) multivariate statistical analysis, such as principal component analysis and independent component analysis, which uses the correlation between monitoring data at multiple measurement points to establish a statistical model and then defines corresponding statistics to identify anomalies in the monitoring data.

Due to the deformation continuity of structures, there is correlation (i.e., cross-correlation or spatial correlation) between structural response data at adjacent measurement points. In practical engineering applications, multivariate statistical analysis is more advantageous because it takes this cross-correlation into account. In addition, this kind of method needs only one or two statistics to decide whether there is an anomaly in the monitoring data, which is very convenient for a structural health monitoring system with many sensors. Besides cross-correlation, there is also autocorrelation (i.e., temporal correlation) in structural response data. If the autocorrelation and cross-correlation (i.e., the spatial-temporal correlation) can be considered simultaneously in the statistical modeling process, the anomaly identification ability of the multivariate statistical analysis method can be improved, making it more practical in engineering applications.

SUMMARY

The present invention aims to propose a statistical modeling method that considers the spatial and temporal correlation simultaneously, on the basis of which statistics are defined to identify anomalies in structural monitoring data. The technical solution of the present invention is as follows: first, define current and past observation vectors for the monitoring data and pre-whiten them; second, establish a statistical correlation model for the pre-whitened current and past observation vectors to simultaneously consider the spatial-temporal correlation in the monitoring data; then, divide the model into two parts, i.e., the system-related and system-unrelated parts, and define two corresponding statistics; finally, determine the corresponding control limits of the statistics, and decide that there is an anomaly in the monitoring data when each of the statistics exceeds its corresponding control limit.

An anomaly identification method for structural monitoring data considering spatial-temporal correlation, the specific steps of which are as follows:

Step 1: Monitoring data preprocessing

(1) Define the current and past observation vectors for the normal monitoring data:

$y^{c}(t) = y(t)$

$y^{p}(t) = [y^{T}(t-1),\; y^{T}(t-2),\; \ldots,\; y^{T}(t-l)]^{T}$

where $y(t) \in \mathbb{R}^{m}$ represents the sample at time $t$ in the normal monitoring data, and $m$ represents the number of measured variables; $y^{c}(t)$ and $y^{p}(t)$ represent the current and past observation vectors defined at time $t$, respectively; $l$ represents the time lag;

(2) Pre-whiten the current observation vector $y^{c}(t)$ and the past observation vector $y^{p}(t)$:

$\tilde{y}^{c}(t) = R^{c} y^{c}(t)$

$\tilde{y}^{p}(t) = R^{p} y^{p}(t)$

where $R^{c}$ and $R^{p}$ represent the pre-whitening matrices corresponding to $y^{c}(t)$ and $y^{p}(t)$, respectively; $\tilde{y}^{c}(t)$ and $\tilde{y}^{p}(t)$ represent the pre-whitened current and past observation vectors, respectively;
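Step 1 can be sketched in a few lines of NumPy. The text does not state how the pre-whitening matrices are computed, so this sketch assumes zero-mean data and takes each whitening matrix as the inverse symmetric square root of the sample covariance; the function names and the toy data are hypothetical and serve only as illustration.

```python
import numpy as np

def build_observation_vectors(Y, l):
    """Y: (N, m) array whose rows are the samples y(t).
    Returns Yc (N-l, m) with rows y^c(t) and Yp (N-l, m*l) with rows y^p(t)."""
    N = len(Y)
    Yc = Y[l:]  # y^c(t) = y(t) for t = l, ..., N-1
    # y^p(t) stacks y(t-1), y(t-2), ..., y(t-l) side by side
    Yp = np.hstack([Y[l - k:N - k] for k in range(1, l + 1)])
    return Yc, Yp

def whitening_matrix(X):
    """Assumed choice: R = C^(-1/2), with C the sample covariance of X."""
    Xc = X - X.mean(axis=0)
    C = Xc.T @ Xc / (len(X) - 1)
    w, V = np.linalg.eigh(C)            # C is symmetric positive definite
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

# Toy data: 500 samples of 3 correlated channels (not real monitoring data)
rng = np.random.default_rng(0)
M = np.array([[1.0, 0.5, 0.0], [0.0, 1.0, 0.5], [0.0, 0.0, 1.0]])
Y = rng.standard_normal((500, 3)) @ M
Y -= Y.mean(axis=0)

Yc, Yp = build_observation_vectors(Y, l=2)
Rc, Rp = whitening_matrix(Yc), whitening_matrix(Yp)
Yc_w, Yp_w = Yc @ Rc.T, Yp @ Rp.T       # rows are the pre-whitened vectors
```

After this step the sample covariance of the rows of `Yp_w` (and of `Yc_w`) is the identity matrix, which is the property that step (4) relies on.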

Step 2: Spatial-temporal correlation modeling

(3) Establish the spatial-temporal correlation model for the normal monitoring data, that is, establish the statistical correlation model between $\tilde{y}^{c}(t)$ and $\tilde{y}^{p}(t)$ as follows:

$(\tilde{C}_{pp}^{-1} \tilde{C}_{pc} \tilde{C}_{cc}^{-1} \tilde{C}_{cp})\,\varphi = \lambda^{2} \varphi$

$(\tilde{C}_{cc}^{-1} \tilde{C}_{cp} \tilde{C}_{pp}^{-1} \tilde{C}_{pc})\,\psi = \lambda^{2} \psi$

where $\tilde{C}_{pp} = E\{\tilde{y}^{p} \tilde{y}^{pT}\}$ and $\tilde{C}_{cc} = E\{\tilde{y}^{c} \tilde{y}^{cT}\}$ represent the auto-covariance matrices of $\tilde{y}^{p}(t)$ and $\tilde{y}^{c}(t)$, respectively; $\tilde{C}_{pc} = E\{\tilde{y}^{p} \tilde{y}^{cT}\}$ and $\tilde{C}_{cp} = E\{\tilde{y}^{c} \tilde{y}^{pT}\}$ represent the cross-covariance matrices of $\tilde{y}^{p}(t)$ and $\tilde{y}^{c}(t)$, respectively;

(4) Since $\tilde{y}^{p}(t)$ and $\tilde{y}^{c}(t)$ are pre-whitened data, $\tilde{C}_{pp}$ and $\tilde{C}_{cc}$ are both identity matrices; by additionally considering $\tilde{C}_{pc}^{T} = \tilde{C}_{cp}$ and $\tilde{C}_{cp}^{T} = \tilde{C}_{pc}$, the statistical correlation model between $\tilde{y}^{p}(t)$ and $\tilde{y}^{c}(t)$ can be further simplified as follows:

$(\tilde{C}_{pc} \tilde{C}_{pc}^{T})\,\varphi = \lambda^{2} \varphi$

$(\tilde{C}_{cp} \tilde{C}_{cp}^{T})\,\psi = \lambda^{2} \psi$

(5) The solution of the above statistical correlation model can be obtained by the following singular value decomposition:

$\tilde{C}_{pc} = E\{\tilde{y}^{p} \tilde{y}^{cT}\} = \Phi \Lambda \Psi^{T}$

where $\Phi = [\varphi_{1}, \varphi_{2}, \ldots, \varphi_{ml}] \in \mathbb{R}^{ml \times ml}$ and $\Psi = [\psi_{1}, \psi_{2}, \ldots, \psi_{m}] \in \mathbb{R}^{m \times m}$ represent the matrices consisting of all left and right singular vectors, respectively; $\Lambda \in \mathbb{R}^{ml \times m}$ represents the singular value matrix, in which the $m$ non-zero singular values are the correlation coefficients between $\tilde{y}^{p}(t)$ and $\tilde{y}^{c}(t)$;

(6) Define the projection of $\tilde{y}^{p}(t)$ on $\Phi$, termed $z(t)$, which can be obtained by:

$z(t) = \Phi^{T} \tilde{y}^{p}(t) = \Phi^{T} R^{p} y^{p}(t) = Q y^{p}(t)$

where $Q = \Phi^{T} R^{p}$;
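Steps (3) to (6) reduce to one singular value decomposition, which the following NumPy sketch illustrates. It assumes the expectations are replaced by sample averages over the training data and that whitening uses the inverse symmetric square root of the sample covariance; neither choice is stated in the text. The function names and the AR(1) toy data are hypothetical.

```python
import numpy as np

def inv_sqrt(C):
    """Inverse symmetric square root of a positive definite matrix."""
    w, V = np.linalg.eigh(C)
    return V @ np.diag(w ** -0.5) @ V.T

def fit_correlation_model(Yc, Yp):
    """Yc: (N, m) current vectors, Yp: (N, m*l) past vectors (one per row).
    Returns Q = Phi^T R^p and the singular values of the whitened cross-covariance."""
    N = len(Yc)
    Yc = Yc - Yc.mean(axis=0)
    Yp = Yp - Yp.mean(axis=0)
    Rc = inv_sqrt(Yc.T @ Yc / (N - 1))
    Rp = inv_sqrt(Yp.T @ Yp / (N - 1))
    Yc_w, Yp_w = Yc @ Rc.T, Yp @ Rp.T
    C_pc = Yp_w.T @ Yc_w / (N - 1)      # sample estimate of E{y~p y~c^T}, shape (ml, m)
    # full_matrices=True gives Phi of shape (ml, ml), as in the text
    Phi, sing, PsiT = np.linalg.svd(C_pc, full_matrices=True)
    return Phi.T @ Rp, sing

# Toy data: a 2-channel AR(1) process, so the past carries information about the present
rng = np.random.default_rng(1)
N, m, l = 2000, 2, 3
Y = np.zeros((N, m))
for t in range(1, N):
    Y[t] = 0.8 * Y[t - 1] + rng.standard_normal(m)
Yc = Y[l:]
Yp = np.hstack([Y[l - k:N - k] for k in range(1, l + 1)])
Yp = Yp - Yp.mean(axis=0)

Q, corr = fit_correlation_model(Yc, Yp)
Z = Yp @ Q.T                            # rows are z(t) = Q y^p(t)
```

Because $\Phi$ is orthogonal and $R^{p}$ whitens the past vectors, the rows of `Z` have identity sample covariance, and the entries of `corr` lie between 0 and 1 in descending order.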

Step 3: Define statistics

(7) Since there are only $m$ non-zero correlation coefficients, the variables in $z(t)$ can be divided into two parts:

$z_{s}(t) = Q_{s} y^{p}(t)$

$z_{n}(t) = Q_{n} y^{p}(t)$

where $z_{s}(t)$ and $z_{n}(t)$ represent the system-related and system-unrelated parts of $z(t)$, respectively; $Q_{s}$ and $Q_{n}$ represent the first $m$ rows and the last $m(l-1)$ rows of $Q$, respectively;

(8) To identify anomalies in the monitoring data, two statistics can be defined for $z_{s}(t)$ and $z_{n}(t)$:

$H_{s}^{2} = z_{s}^{T} z_{s} = y^{pT} (Q_{s}^{T} Q_{s})\, y^{p}$

$H_{n}^{2} = z_{n}^{T} z_{n} = y^{pT} (Q_{n}^{T} Q_{n})\, y^{p}$

For newly acquired monitoring data, the past observation vector $y^{p}$ is constructed first; the two corresponding statistics, i.e., $H_{s}^{2}$ and $H_{n}^{2}$, are then calculated; it can be decided that there are anomalies in the monitoring data when each of the statistics exceeds its corresponding control limit;
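As a small numerical illustration of steps (7) and (8), the sketch below splits a given matrix `Q` into its first `m` rows and the remaining rows and evaluates both statistics for one past observation vector. The identity matrix used for `Q` is a made-up placeholder chosen so the arithmetic can be checked by hand; it is not a fitted model, and the function names are hypothetical.

```python
import numpy as np

def split_model(Q, m):
    """Q_s: first m rows of Q (system-related); Q_n: remaining m(l-1) rows."""
    return Q[:m], Q[m:]

def statistics(y_p, Q_s, Q_n):
    """Return (H_s^2, H_n^2) for a single past observation vector y_p."""
    z_s, z_n = Q_s @ y_p, Q_n @ y_p
    return float(z_s @ z_s), float(z_n @ z_n)

# Placeholder model with m = 2 and l = 2, so Q is 4 x 4
Q = np.eye(4)
Q_s, Q_n = split_model(Q, m=2)
y_p = np.array([1.0, 2.0, 3.0, 4.0])

H_s2, H_n2 = statistics(y_p, Q_s, Q_n)
# Same values via the quadratic forms y^pT (Q^T Q) y^p
H_s2_quad = float(y_p @ (Q_s.T @ Q_s) @ y_p)
H_n2_quad = float(y_p @ (Q_n.T @ Q_n) @ y_p)
print(H_s2, H_n2)  # 5.0 25.0
```

The two ways of writing each statistic agree, so in practice the matrices $Q_{s}^{T} Q_{s}$ and $Q_{n}^{T} Q_{n}$ could be precomputed once after training.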

Step 4: Determine control limits

(9) If the monitoring data are Gaussian distributed, the two statistics $H_{s}^{2}$ and $H_{n}^{2}$ theoretically follow the F-distribution, and the theoretical values of the control limits are determined as:

$H_{s,\mathrm{lim}}^{2}(\alpha) \approx \frac{m(m^{2}l^{2}-1)}{ml(ml-m)}\, F_{m,\,ml-m}(\alpha)$

$H_{n,\mathrm{lim}}^{2}(\alpha) \approx \frac{m(l-1)(m^{2}l^{2}-1)}{m^{2}l}\, F_{ml-m,\,m}(\alpha)$

where $H_{s,\mathrm{lim}}^{2}$ and $H_{n,\mathrm{lim}}^{2}$ represent the control limits of the statistics $H_{s}^{2}$ and $H_{n}^{2}$, respectively; $\alpha$ represents the significance level, which is generally set to 0.01;

(10) If the monitoring data are not Gaussian distributed, the probability density distributions of the two statistics $H_{s}^{2}$ and $H_{n}^{2}$ can be separately estimated by other methods, and the control limits are then determined according to the given significance level.
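For the non-Gaussian case in step (10), the text leaves the estimation method open. One simple possibility, shown here purely as an assumption, is to take the empirical $(1-\alpha)$ quantile of a statistic evaluated on the normal training data as its control limit; kernel density estimation would be another option. The function name and the example values are hypothetical.

```python
import numpy as np

def empirical_control_limit(h_train, alpha=0.01):
    """Control limit as the (1 - alpha) empirical quantile of a statistic
    computed on normal (training) data. This is an assumed stand-in for the
    unspecified 'other methods' in the text."""
    return float(np.quantile(np.asarray(h_train, dtype=float), 1.0 - alpha))

# Made-up training values of one statistic: 1, 2, ..., 100
h_train = np.arange(1, 101)
limit = empirical_control_limit(h_train, alpha=0.01)

# A new value is flagged when it exceeds the limit
h_new = np.array([50.0, 99.5, 120.0])
exceeds = h_new > limit
```

With this choice roughly a fraction $\alpha$ of the training samples lies above the limit by construction, so the limit is only as reliable as the amount of normal data used to estimate it.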

The beneficial effect of the present invention is that the spatial-temporal correlation of structural monitoring data is taken into account in the statistical modeling process, on the basis of which the defined statistics can effectively identify anomalies in the monitoring data.

DETAILED DESCRIPTION

Take a two-span highway bridge model, with a length of 5.5 m and a width of 1.8 m, as an example. A finite element model is built to simulate structural responses, and the responses at 16 finite element nodes are acquired as monitoring data. Two datasets are generated: the training dataset and the testing dataset; the training dataset consists of normal monitoring data, and part of the testing dataset is used to simulate abnormal monitoring data; both datasets last for 80 s and the sampling frequency is 256 Hz. The key of the present invention lies in the spatial-temporal correlation modeling process for the structural monitoring data, as shown in the following schematic:

(1) Construct the current observation vector $y^{c}(t)$ and the past observation vector $y^{p}(t)$ for each data point in the training dataset; then pre-whiten all current and past observation vectors (i.e., $y^{c}(t)$ and $y^{p}(t)$) to obtain the whitening matrices (i.e., $R^{c}$ and $R^{p}$) and the pre-whitened data (i.e., $\tilde{y}^{c}(t)$ and $\tilde{y}^{p}(t)$).

(2) Establish the spatial-temporal correlation model for the training dataset, that is, build a statistical correlation model for $\tilde{y}^{c}(t)$ and $\tilde{y}^{p}(t)$ to obtain the model parameters $Q = \Phi^{T} R^{p}$ and $\Lambda$; since there are only 16 non-zero correlation coefficients in $\Lambda$, the first 16 rows of the matrix $Q$ are used to construct $Q_{s}$ and the others are used to construct $Q_{n}$.
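For orientation, the sizes implied by this example can be written out from the figures given above (80 s at 256 Hz, 16 measured variables). The time lag $l$ is not stated for this example, so it is left symbolic here:

$N = 80\ \mathrm{s} \times 256\ \mathrm{Hz} = 20480$ samples per dataset, $\quad m = 16$

$y^{c}(t) \in \mathbb{R}^{16}, \quad y^{p}(t) \in \mathbb{R}^{16l}, \quad$ with $N - l = 20480 - l$ observation vector pairs available for training

$Q \in \mathbb{R}^{16l \times 16l}, \quad Q_{s} \in \mathbb{R}^{16 \times 16l}, \quad Q_{n} \in \mathbb{R}^{16(l-1) \times 16l}$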

(3) Determine the control limits of the statistics, i.e., $H_{s,\mathrm{lim}}^{2}$ and $H_{n,\mathrm{lim}}^{2}$; after new monitoring data is acquired, the past observation vector is first constructed, and then the two statistics, i.e., $H_{s}^{2} = y^{pT}(Q_{s}^{T} Q_{s})\, y^{p}$ and $H_{n}^{2} = y^{pT}(Q_{n}^{T} Q_{n})\, y^{p}$, are calculated; it can be decided that there are anomalies in the monitoring data when each of the two statistics exceeds its corresponding control limit.

(4) Simulate abnormal monitoring data in the testing dataset, that is, introduce an anomaly into the monitoring data of sensor 3 during the period from 40 s to 80 s; then identify anomalies in the monitoring data using the two proposed statistics $H_{s}^{2}$ and $H_{n}^{2}$. The results show that both $H_{s}^{2}$ and $H_{n}^{2}$ can successfully identify the anomalies in the monitoring data.
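The four steps of this example can be tried end to end on synthetic data. The sketch below does not reproduce the bridge finite element model: it uses a made-up 4-channel signal driven by two latent AR(1) processes, a hypothetical offset fault on the third channel in the second half of the test record, whitening by inverse symmetric square root, and empirical 99% quantiles as control limits. All of these are assumptions for illustration only.

```python
import numpy as np

def simulate(n, rng):
    """Synthetic stand-in for sensor data: 2 latent AR(1) modes mixed into 4 channels."""
    A = np.array([[1.0, 0.5], [0.8, -0.6], [0.6, 0.9], [-0.7, 0.7]])
    x = np.zeros((n, 2))
    for t in range(1, n):
        x[t] = 0.9 * x[t - 1] + rng.standard_normal(2)
    return x @ A.T + 0.3 * rng.standard_normal((n, 4))

def past_vectors(Y, l):
    """Rows are y^p(t) = [y(t-1), ..., y(t-l)] for t = l, ..., n-1."""
    n = len(Y)
    return np.hstack([Y[l - k:n - k] for k in range(1, l + 1)])

def inv_sqrt(C):
    w, V = np.linalg.eigh(C)
    return V @ np.diag(w ** -0.5) @ V.T

def scores(Yp, Qs, Qn):
    """H_s^2 and H_n^2 for every row of Yp."""
    return ((Yp @ Qs.T) ** 2).sum(axis=1), ((Yp @ Qn.T) ** 2).sum(axis=1)

def fit(Y, l, alpha=0.01):
    m = Y.shape[1]
    mean = Y.mean(axis=0)
    Yc, Yp = (Y - mean)[l:], past_vectors(Y - mean, l)
    n = len(Yc)
    Rc, Rp = inv_sqrt(Yc.T @ Yc / n), inv_sqrt(Yp.T @ Yp / n)
    C_pc = (Yp @ Rp.T).T @ (Yc @ Rc.T) / n          # whitened cross-covariance
    Phi = np.linalg.svd(C_pc, full_matrices=True)[0]
    Q = Phi.T @ Rp
    Qs, Qn = Q[:m], Q[m:]
    Hs, Hn = scores(Yp, Qs, Qn)
    # Assumed empirical control limits from the training statistics
    return mean, Qs, Qn, np.quantile(Hs, 1 - alpha), np.quantile(Hn, 1 - alpha)

rng = np.random.default_rng(0)
l, half = 2, 2000
mean, Qs, Qn, lim_s, lim_n = fit(simulate(4000, rng), l)

Y_test = simulate(2 * half, rng)
Y_test[half:, 2] += 10.0          # hypothetical offset fault on the third channel
Hs, Hn = scores(past_vectors(Y_test - mean, l), Qs, Qn)

alarm = (Hs > lim_s) | (Hn > lim_n)
normal_rate = alarm[:half - l].mean()   # past vectors built only from normal samples
fault_rate = alarm[half:].mean()        # past vectors built only from faulty samples
print(f"alarm rate, normal segment: {normal_rate:.3f}")
print(f"alarm rate, faulty segment: {fault_rate:.3f}")
```

The sketch flags a sample when either statistic exceeds its limit so that the two segments can be compared with a single rate; the per-statistic exceedances `Hs > lim_s` and `Hn > lim_n` can be inspected separately to apply the decision rule as stated in the text.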

CLAIMS

1. An anomaly identification method for structural monitoring data considering spatial-temporal correlation, wherein the specific steps are as follows:

Step 1: Monitoring data preprocessing

(1) Define the current and past observation vectors for the normal monitoring data:

$y^{c}(t) = y(t)$

$y^{p}(t) = [y^{T}(t-1),\; y^{T}(t-2),\; \ldots,\; y^{T}(t-l)]^{T}$

where $y(t) \in \mathbb{R}^{m}$ represents the sample at time $t$ in the normal monitoring data, and $m$ represents the number of measured variables; $y^{c}(t)$ and $y^{p}(t)$ represent the current and past observation vectors defined at time $t$, respectively; $l$ represents the time lag;

(2) Pre-whiten the current observation vector $y^{c}(t)$ and the past observation vector $y^{p}(t)$:

$\tilde{y}^{c}(t) = R^{c} y^{c}(t)$

$\tilde{y}^{p}(t) = R^{p} y^{p}(t)$

where $R^{c}$ and $R^{p}$ represent the pre-whitening matrices corresponding to $y^{c}(t)$ and $y^{p}(t)$, respectively; $\tilde{y}^{c}(t)$ and $\tilde{y}^{p}(t)$ represent the pre-whitened current and past observation vectors, respectively;

Step 2: Spatial-temporal correlation modeling

(3) Establish the spatial-temporal correlation model for the normal monitoring data, that is, establish the statistical correlation model between $\tilde{y}^{c}(t)$ and $\tilde{y}^{p}(t)$ as follows:

$(\tilde{C}_{pp}^{-1} \tilde{C}_{pc} \tilde{C}_{cc}^{-1} \tilde{C}_{cp})\,\varphi = \lambda^{2} \varphi$

$(\tilde{C}_{cc}^{-1} \tilde{C}_{cp} \tilde{C}_{pp}^{-1} \tilde{C}_{pc})\,\psi = \lambda^{2} \psi$

where $\tilde{C}_{pp} = E\{\tilde{y}^{p} \tilde{y}^{pT}\}$ and $\tilde{C}_{cc} = E\{\tilde{y}^{c} \tilde{y}^{cT}\}$ represent the auto-covariance matrices of $\tilde{y}^{p}(t)$ and $\tilde{y}^{c}(t)$, respectively; $\tilde{C}_{pc} = E\{\tilde{y}^{p} \tilde{y}^{cT}\}$ and $\tilde{C}_{cp} = E\{\tilde{y}^{c} \tilde{y}^{pT}\}$ represent the cross-covariance matrices of $\tilde{y}^{p}(t)$ and $\tilde{y}^{c}(t)$, respectively;

(4) Since $\tilde{y}^{p}(t)$ and $\tilde{y}^{c}(t)$ are pre-whitened data, $\tilde{C}_{pp}$ and $\tilde{C}_{cc}$ are both identity matrices; by additionally considering $\tilde{C}_{pc}^{T} = \tilde{C}_{cp}$ and $\tilde{C}_{cp}^{T} = \tilde{C}_{pc}$, the statistical correlation model between $\tilde{y}^{p}(t)$ and $\tilde{y}^{c}(t)$ can be further simplified as follows:

$(\tilde{C}_{pc} \tilde{C}_{pc}^{T})\,\varphi = \lambda^{2} \varphi$

$(\tilde{C}_{cp} \tilde{C}_{cp}^{T})\,\psi = \lambda^{2} \psi$

(5) The solution of the above statistical correlation model can be obtained by the following singular value decomposition:

$\tilde{C}_{pc} = E\{\tilde{y}^{p} \tilde{y}^{cT}\} = \Phi \Lambda \Psi^{T}$

where $\Phi = [\varphi_{1}, \varphi_{2}, \ldots, \varphi_{ml}] \in \mathbb{R}^{ml \times ml}$ and $\Psi = [\psi_{1}, \psi_{2}, \ldots, \psi_{m}] \in \mathbb{R}^{m \times m}$ represent the matrices consisting of all left and right singular vectors, respectively; $\Lambda \in \mathbb{R}^{ml \times m}$ represents the singular value matrix, in which the $m$ non-zero singular values are the correlation coefficients between $\tilde{y}^{p}(t)$ and $\tilde{y}^{c}(t)$;

(6) Define the projection of $\tilde{y}^{p}(t)$ on $\Phi$, termed $z(t)$, which can be obtained by:

$z(t) = \Phi^{T} \tilde{y}^{p}(t) = \Phi^{T} R^{p} y^{p}(t) = Q y^{p}(t)$

where $Q = \Phi^{T} R^{p}$;

Step 3: Define statistics

(7) Since there are only $m$ non-zero correlation coefficients, the variables in $z(t)$ can be divided into two parts:

$z_{s}(t) = Q_{s} y^{p}(t)$

$z_{n}(t) = Q_{n} y^{p}(t)$

where $z_{s}(t)$ and $z_{n}(t)$ represent the system-related and system-unrelated parts of $z(t)$, respectively; $Q_{s}$ and $Q_{n}$ represent the first $m$ rows and the last $m(l-1)$ rows of $Q$, respectively;

(8) To identify anomalies in the monitoring data, two statistics can be defined for $z_{s}(t)$ and $z_{n}(t)$:

$H_{s}^{2} = z_{s}^{T} z_{s} = y^{pT} (Q_{s}^{T} Q_{s})\, y^{p}$

$H_{n}^{2} = z_{n}^{T} z_{n} = y^{pT} (Q_{n}^{T} Q_{n})\, y^{p}$

For newly acquired monitoring data, the past observation vector $y^{p}$ is constructed first; the two corresponding statistics $H_{s}^{2}$ and $H_{n}^{2}$ are then calculated; it can be decided that there are anomalies in the monitoring data when each of the statistics exceeds its corresponding control limit;

Step 4: Determine control limits

(9) If the monitoring data are Gaussian distributed, the two statistics $H_{s}^{2}$ and $H_{n}^{2}$ theoretically follow the F-distribution, and the theoretical values of the control limits are determined as:

$H_{s,\mathrm{lim}}^{2}(\alpha) \approx \frac{m(m^{2}l^{2}-1)}{ml(ml-m)}\, F_{m,\,ml-m}(\alpha)$

$H_{n,\mathrm{lim}}^{2}(\alpha) \approx \frac{m(l-1)(m^{2}l^{2}-1)}{m^{2}l}\, F_{ml-m,\,m}(\alpha)$

where $H_{s,\mathrm{lim}}^{2}$ and $H_{n,\mathrm{lim}}^{2}$ represent the control limits of the statistics $H_{s}^{2}$ and $H_{n}^{2}$, respectively; $\alpha$ represents the significance level, which is generally set to 0.01;

(10) If the monitoring data are not Gaussian distributed, the probability density distributions of the two statistics $H_{s}^{2}$ and $H_{n}^{2}$ can be separately estimated by other methods, and the control limits are then determined according to the given significance level.